ABSTRACT 

Classical  Factor  Analysis  Regression  is  a  statistical  technique  using 
factor  analysis  to  calculate  a  linear  function  similar  to  ordinary  least 
squares  regression.  CFAR  has  been  recommended  to  replace  OLS 
in  cases  where  there  is  high  multicollinearity  among  the  explanatory 
variables  and  when  there  are  errors  in  the  variables  as  well  as  when 
there may be outliers in the data. Mathematical derivation of the distribution
functions of the CFAR coefficients has so far not been done.
The  research  reported  here  is  a  Monte  Carlo  study  to  determine  the 
statistical  goodness  of  CFAR  compared  to  OLS.  The  results  of 
this  research  show  that  CFAR  is  superior  to  OLS  whenever  there 
is  high  multicollinearity  or  errors  in  the  variables.  The  variances  of 
the b coefficients are smaller for CFAR and the biases asymptotically
approach zero. Also, the distributions of the coefficients appear to be normal,
so statistical tests based on the normal distribution can be used.

Key  Words:  Factor  Analysis,  Principal  Components,  Monte  Carlo, 
Factor  Analysis  Regression,  Ordinary  Least  Squares,  Multicollinearity, 
Errors  in  the  Variables,  Outliers. 
Statistical  Analysis  of  the  Goodness  of 
Classical  Factor  Analysis  Regression  (CFAR) 

John  T.  Scott,  Jr.,  and  Allen  Fleishman 

Regression from factor analysis has been suggested by several authors
as an alternative for ordinary least-squares (OLS) regression when
the explanatory variables are subject to error or there is significant
multicollinearity (Kloek and Mennes, 1960; Amemiya, 1966; Scott,
1966; Lawley, 1973).

In the case of multicollinearity, when the determinant of the explanatory
variables correlation matrix approaches zero, it is well known
that OLS can give spurious results. The regression coefficients frequently
do not correspond to either the theory or the zero-order correlation
coefficients, and the variances are inconsistent. Also, when
there are errors in the variables (which is normal with economic data),
it has been shown (Johnston, 1963) that OLS regression coefficients are
biased and that the associated variances are not only inconsistent but
generally underestimate the true variances. These results follow from
violation of two OLS assumptions: that the explanatory variables are
independent, and that the explanatory variables are known, fixed numbers
without error. For example, if we assume there are errors in all
variables, then the OLS model becomes:

(1)  $(y - v) = b_1(x_1 - u_1) + b_2(x_2 - u_2) + \dots + b_k(x_k - u_k) + w$

or  in  matrix  notation: 

(2)  (Y  -  V)  =  (X  -  U)B  +  W; 

where Y is the n × 1 vector of observed values of the dependent variable adjusted for the mean,
V is the n × 1 vector of errors in Y,
X is the n × k matrix of n observations of the k explanatory variables adjusted for the mean,
U is the n × k matrix of errors in X, and
W is the n × 1 vector of residuals from regression which may include specification errors as well as other errors not included in V.

Minimizing the sum of squared residuals $W'W$ with respect to $B$ results in

(3)  $\hat{B} = [(X - U)'(X - U)]^{-1}[(X - U)'(Y - V)]$, or

$\hat{B} = [(X - U)'(X - U)]^{-1}[X'Y - U'Y - X'V + U'V]$.

We  can  simplify  the  foregoing  expression  by  making  three  additional 
assumptions  not  found  in  the  classic  assumptions  underlying  OLS: 
the  errors  in  Y  are  uncorrelated  with  those  in  X;  the  errors  in  X  are 
uncorrelated  with  Y;  and  the  errors  in  Y  are  uncorrelated  with  X.  These 
assumptions  are  reasonable  and  only  moderately  restrictive.  Then  the 
last  three  terms  in  equation  3  become  zero  and  B  becomes: 

(4)  $\hat{B} = [(X - U)'(X - U)]^{-1} X'Y$, or

$\hat{B} = [X'X - 2X'U + U'U]^{-1} X'Y$.

Assuming  that  the  errors  in  X  are  independent  of  X  itself  (which 
is  still  another  assumption),  the  middle  term  of  the  inverse  in  equation 
4  drops  and  then  equation  4  becomes : 

(5)  $\hat{B} = [X'X - U'U]^{-1} X'Y$.
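The effect of errors in X, and the correction in equation (5), can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true coefficients, error variances, and sample size are hypothetical, the simulated variables have mean zero by construction, and U'U is taken from the simulated errors, which is possible in a simulation but not with real data.

```python
import numpy as np

def ols(X, Y):
    """Ordinary least squares on mean-adjusted data: (X'X)^{-1} X'Y."""
    return np.linalg.solve(X.T @ X, X.T @ Y)

def corrected(X, Y, UtU):
    """Equation (5): (X'X - U'U)^{-1} X'Y, with U'U supplied by the caller."""
    return np.linalg.solve(X.T @ X - UtU, X.T @ Y)

rng = np.random.default_rng(0)
n, k = 5000, 2
beta = np.array([1.0, -0.5])            # hypothetical true coefficients
X_true = rng.normal(size=(n, k))        # error-free explanatory variables
U = rng.normal(scale=0.5, size=(n, k))  # measurement errors in X
X_obs = X_true + U                      # observed X = true X + error
Y = X_true @ beta + rng.normal(scale=0.1, size=n)

b_ols = ols(X_obs, Y)                   # shrunk toward zero by the errors in X
b_cor = corrected(X_obs, Y, U.T @ U)    # uses the (normally unknown) U'U
print("OLS:      ", b_ols)
print("corrected:", b_cor)
```

With these settings the OLS coefficients should be noticeably smaller in absolute value than the true ones, while the corrected coefficients should lie close to them.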

To  estimate  this  modification  of  the  OLS  model,  we  need  to  know 
as  a  minimum  the  variance-covariance  matrix  of  the  errors  in  X.  The 
problem  is  that  this  is  rarely  if  ever  known  in  the  real  world.  If  we 
make  the  assumption  that  the  errors  in  X  are  uncorrelated  with  each 
other,  then  equation  (5)  becomes  the  ridge  regression  estimator: 

(6)  $\hat{B} = [X'X - aI]^{-1} X'Y$,

where  a  is  another  parameter  which  must  be  estimated,  which  is  no 
trivial  task  (Marquardt,  1970;  McDonald  and  Galarneau,  1975). 

While empirical results from factor analysis regression are substantially
better than those from OLS based on a priori expectations
(Amemiya, 1966; Scott, 1966; Oehrtman, 1968; Bursch et al., 1972),
the statistical properties of the estimators in factor analysis regression
have not been derived mathematically, nor does such a derivation appear
tractable.1 The alternative method generally acceptable for obtaining
the statistical characteristics of an estimator is to perform a Monte Carlo
study of the estimator. The development of such a study involving
classical factor analysis regression and its results are reported here.

CLASSICAL  FACTOR  ANALYSIS  REGRESSION 

The  factor  analysis  statistical  model  assumes  that  a  large  number  of 
variables  can  be  described  adequately  by  a  smaller  number  of  factors: 

(8)  $Z = AF + U$

where Z is the h × n matrix of n observations of all h real variables involved,
A is the h × m matrix of regression coefficients, usually referred to as factor coefficients or factor loadings, with m < h,
F is the m × n matrix of the n values of the m factors, and
U is the h × n matrix of the n residuals associated with the h variables.

1 The senior author has worked on this problem and consulted others including
R. A. Wijsman, Department of Mathematics, and Ledyard Tucker and Charles
Lewis, Department of Psychology, University of Illinois. All suggested the
Monte Carlo approach.

It is assumed that E(U) = 0; E(F) = 0; E(UU') = V, a diagonal
matrix; E(FF') = I; and E(Z) = 0; and further, that U and F are
independent and have multivariate normal distributions.
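Under these assumptions the covariance matrix of Z is AA' + V. The following sketch simulates equation (8) and compares the sample covariance with that implied structure. The loading matrix is hypothetical and chosen so that each variable has unit variance; nothing here comes from the study's own data.

```python
import numpy as np

rng = np.random.default_rng(1)
h, m, n = 4, 2, 200_000
A = np.array([[0.8, 0.0],
              [0.7, 0.2],
              [0.1, 0.9],
              [0.0, 0.6]])               # hypothetical h x m loadings
V = np.diag(1.0 - np.sum(A**2, axis=1))  # diagonal unique variances
F = rng.normal(size=(m, n))              # factors with E(FF') = I
U = np.sqrt(np.diag(V))[:, None] * rng.normal(size=(h, n))  # residuals
Z = A @ F + U                            # equation (8)

implied = A @ A.T + V                    # model-implied covariance of Z
sample = (Z @ Z.T) / n                   # sample covariance, using E(Z) = 0
max_gap = np.max(np.abs(sample - implied))
print(max_gap)                           # small for large n
```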

A number of methods have been developed to "extract factors" or
calculate the coefficient matrix to meet the foregoing statistical assumptions
(Hotelling, 1933; Guttman, 1940; Lawley, 1940; Rao, 1955;
Joreskog, 1962; and others).

A  derivation  of  regression  from  factor  analysis  was  developed 
which,  for  purposes  of  differentiation,  is  called  "classical  factor  analysis 
regression"  or  "see  far  —  CFAR"  (Scott,  1970).  Since  CFAR  is  much 
simpler  and  easier  to  obtain  than  the  earlier  factor  analysis  regression 
derivations,  it  should  appeal  to  practitioners  for  their  research  work. 
The  results  from  CFAR  are  as  good  as,  or  better  than,  those  from  the 
earlier  factor  analysis  regression  methods. 

Using  standardized  variables  in  ordinary  least-squares  regression 
(OLS)  results  in  the  following  equation  to  estimate  the  regression 
coefficients : 

(9)  $\hat{B} = R_{xx}^{-1} R_{xy}$

where $\hat{B}$ is the k × 1 vector of regression coefficients,
$R_{xx}$ is the k × k correlation matrix of the explanatory variables, and
$R_{xy}$ is the k × 1 vector of correlations between the dependent and the explanatory variables.
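Equation (9) can be checked numerically. The sketch below computes the standardized coefficients from the correlation matrix and compares them with a least-squares fit on z-scored data. The data and the function name `standardized_ols` are illustrative assumptions, not part of the original study.

```python
import numpy as np

def standardized_ols(X, y):
    """Return B = Rxx^{-1} Rxy for standardized variables (equation 9)."""
    data = np.column_stack([X, y])
    R = np.corrcoef(data, rowvar=False)   # (k+1) x (k+1) correlation matrix
    k = X.shape[1]
    Rxx, Rxy = R[:k, :k], R[:k, k]
    return np.linalg.solve(Rxx, Rxy)

# Hypothetical data: three explanatory variables, two of them correlated.
rng = np.random.default_rng(3)
X = rng.normal(size=(200, 3))
X[:, 1] += 0.5 * X[:, 0]
y = X @ np.array([0.5, -1.0, 2.0]) + rng.normal(size=200)

b_corr = standardized_ols(X, y)

# Same coefficients from least squares on z-scored variables.
zx = (X - X.mean(axis=0)) / X.std(axis=0)
zy = (y - y.mean()) / y.std()
b_lstsq = np.linalg.lstsq(zx, zy, rcond=None)[0]
print(b_corr, b_lstsq)
```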

The  factor  analysis  statistical  model,  equation  8,  allows  for  errors  in 
all  variables  and  can  be  used  in  situations  involving  high  multicollinear- 
ity.  Factor  analysis  regression  may  also  give  improved  results  over  OLS 
when  the  data  set  contains  a  number  of  extreme  values  or  outliers.  Thus 
the  assumptions  of  factor  analysis  seem  more  appropriate  for  use  with 
real  economic  data  than  do  the  assumptions  of  ordinary  least  squares. 

Let the matrix R be the matrix of correlations among the explanatory
variables augmented with the correlations between the dependent
and the explanatory variables. This matrix has dimensions k + 1 by
k + 1. Using matrix R, obtain the factor loading matrix A, by least
squares or maximum-likelihood (Lawley, 1940; Whittle, 1952; Rao,
1955; Joreskog, 1962). Then:

(10)  $AA' + V = \hat{R}$,

where V is a diagonal matrix equal to the difference between I, the identity
matrix, and diagonal (AA'), that is, $V = I - \mathrm{diag}(AA')$; and $\hat{R}$ is the
maximum-likelihood estimate of the full correlation matrix.

Then partition $\hat{R}$ into $\hat{R}_{xx}$, the k by k estimated correlation matrix of
the explanatory variables, and $\hat{R}_{xy}$, the k by 1 vector of estimated correlations
between the explanatory variables and the dependent variable. Use
these estimated correlations in the OLS regression coefficient estimating
equation to get the CFAR coefficients, $\hat{B}$, so that:

(11)  $\hat{B} = \hat{R}_{xx}^{-1} \hat{R}_{xy}$.
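The steps in equations (10) and (11) can be sketched as follows. This is a minimal illustration, not the authors' program: it assumes the factors are extracted as principal components of the augmented correlation matrix (one of the two extraction methods used in this study) rather than by least squares or maximum likelihood, and the function name `cfar_pc` is hypothetical.

```python
import numpy as np

def cfar_pc(R, m):
    """CFAR coefficients from a (k+1) x (k+1) augmented correlation matrix R
    (dependent variable in the last row and column), using m principal
    components as the extracted factors."""
    eigval, eigvec = np.linalg.eigh(R)             # eigenvalues, ascending
    top = np.argsort(eigval)[::-1][:m]             # indices of the m largest
    lam = np.clip(eigval[top], 0.0, None)          # guard against round-off
    A = eigvec[:, top] * np.sqrt(lam)              # factor loading matrix
    AAt = A @ A.T
    V = np.diag(1.0 - np.diag(AAt))                # V = I - diag(AA')
    R_hat = AAt + V                                # equation (10)
    k = R.shape[0] - 1
    Rxx_hat, Rxy_hat = R_hat[:k, :k], R_hat[:k, k]
    return np.linalg.solve(Rxx_hat, Rxy_hat)       # equation (11)

# Hypothetical data with correlated explanatory variables.
rng = np.random.default_rng(4)
X = rng.normal(size=(100, 3))
X[:, 2] = 0.9 * X[:, 0] + 0.1 * rng.normal(size=100)
y = X @ np.array([0.4, 0.3, 0.2]) + rng.normal(size=100)
R = np.corrcoef(np.column_stack([X, y]), rowvar=False)

b_one_factor = cfar_pc(R, 1)
b_all_factors = cfar_pc(R, 4)   # all components: reproduces R, hence OLS
b_ols = np.linalg.solve(R[:3, :3], R[:3, 3])
```

When every component is retained the reproduced matrix equals R and the result coincides with standardized OLS; with fewer factors the reproduced correlations are smoothed before the coefficients are computed.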

The  long-run  efficacy  of  any  statistical  method  at  least  partly  depends 
upon  having  knowledge  of  the  statistical  properties  of  the  method, 
especially  of  the  characteristics  of  the  parameter  estimators.  We  try  to 
obtain  this  knowledge  for  the  CFAR  estimators  in  the  Monte  Carlo 
study. 

CONCEPT  AND  PROCEDURE  OF  THE  MONTE  CARLO  STUDY 

The concept of this study was to use observations of a population
with a dependent variable that is associated with observations of a set
of explanatory variables, all observations assumed to be without sampling
error. Then the OLS regression estimators for this set were
assumed to be the parameters or expected values of the estimators. To
this original population random normal errors were added to all
variables. This new population with errors in all variables then is the
observed population to be sampled for the Monte Carlo experiment.
From this set of observations with measurement errors, a large
number of random samples of various sample sizes were drawn, and the
regression was estimated for each sample with CFAR and OLS. The population
of coefficients obtained from these regressions was then examined by comparing the
mean of each estimator with its corresponding parameter (whether or not
there is a bias or how the bias behaves), and by examining how closely the
distribution of the estimators corresponds to the normal distribution. Desired
characteristics for the CFAR coefficients would be unbiasedness, efficiency,
and normality. Two additional important characteristics are compared
for CFAR and OLS. These are the mean-square error for the prediction:
$\sum (Y - \hat{Y})^2$, where $Y$ is the original population value without
error and $\hat{Y}$ is the predicted value based on the estimation from the observed
variables with error; and the mean-square error for the regression
coefficients: $\sum (\beta - \hat{\beta})^2$, where $\beta$ is the OLS estimate from the original
population without sampling error as the parameter, and $\hat{\beta}$ is the regression
coefficient estimated by CFAR and OLS from the observed variables
with error included. If $\sum (Y - \hat{Y})^2$ and $\sum (\beta - \hat{\beta})^2$ estimated by CFAR
are less than when estimated by OLS, then this is evidence that CFAR
is in some sense a better estimating procedure. These latter two criteria
are usually considered more important for small sample size than are
unbiasedness and normality.
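The two criteria can be written as short functions. The sketch below is illustrative only: the function names and the numbers are hypothetical, and the sums of squares follow the formulas in the text (dividing by the number of terms would give the corresponding means).

```python
import numpy as np

def prediction_sse(Y_true, Y_pred):
    """Sum over observations of (Y - Y_hat)^2, with Y the error-free value."""
    Y_true, Y_pred = np.asarray(Y_true, float), np.asarray(Y_pred, float)
    return float(np.sum((Y_true - Y_pred) ** 2))

def coefficient_sse(beta, beta_hat):
    """Sum over coefficients of (beta - beta_hat)^2."""
    beta, beta_hat = np.asarray(beta, float), np.asarray(beta_hat, float)
    return float(np.sum((beta - beta_hat) ** 2))

# Hypothetical values, only to show the calls.
pred_err = prediction_sse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])   # (3 - 5)^2 = 4
coef_err = coefficient_sse([0.5, -0.2], [0.4, 0.0])           # 0.01 + 0.04
print(pred_err, coef_err)
```

Computing both quantities for the CFAR and the OLS estimates of the same sample gives the comparison described above.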

MONTE  CARLO  PROCEDURE 

Three original populations were selected, each with one dependent
variable and twelve explanatory variables. Then, using four sets of associative
characteristics and two variable generating procedures, 24 populations,
which are now called initial populations, were generated having
various internal characteristics.1

For this Monte Carlo experiment, a substantial range was generated
in the associative characteristics because of the wide range of these
characteristics found in empirical observations. For example, with most
economic data more of the intercorrelations are positive than negative;
some socioeconomic variables have high intercorrelations, such as prices
of substitutes or economic variables over time and time
series; and some socioeconomic variables occasionally have low intercorrelations,
typically those from cross-section data and survey questionnaires.
Also, the range in the proportion of the variance of the
dependent variable explained by regression is frequently quite large.
Therefore we believed that it was imperative to use different initial
populations representative of a wide range of various associative
characteristics.

Assuming that the observations in the initial populations were without
error, we calculated the OLS regression for each of the initial 24
populations and assumed the coefficients from these regressions to be
the parameters or expected values of the coefficients for each respective
initial population.

A random normal error structure was added to all variables in each
initial population so that we then had 24 populations with errors in all
variables which became the "observable" values to be sampled. Then
from each of the 24 populations with errors in the variables, 100 samples
each of size 16, size 64, and size 256 were drawn, with the sampling
error structure potentially different with each draw, simulating drawing
from an infinite population. Thus, there were 7,200 sample variance-covariance
matrices drawn for this experiment. An OLS regression was
run for each of the 7,200 samples, each with 12 explanatory variables.

1 By associative or internal characteristics is meant the interrelationship of
the variables within any one initial population.

To obtain factor analysis regression, the sample correlation matrix
must be factor-analyzed and a reproduced correlation matrix calculated
from the factor-loading matrix. An important consideration in factor
analysis is the number of factors to be extracted from the sample correlation
matrix. The factor-reproduced correlation matrix will differ, depending
upon the number of factors extracted. With 12 explanatory
variables, we believed a maximum of six factors should be ample to
describe the underlying phenomena. Not knowing the change in characteristics
of the CFAR estimators that might occur as a result of
using different numbers of factors, we extracted and reproduced a
correlation matrix from one factor, from two factors, etc., up to and
including six factors, using the factors explaining the most cumulative
variance in all cases. Thus from each sample correlation matrix there
were six reproduced correlation matrices. A classical factor analysis
regression equation was estimated from each of these six reproduced
correlation matrices, making 43,200 CFAR equations, each with 12
explanatory variables, that were estimated for this Monte Carlo
experiment.

Since factor extraction and communality estimation by least squares
or maximum-likelihood is much more expensive than obtaining the
principal components, the experiment included obtaining the factors by
principal components as well as by a statistical routine, and calculating
the regression coefficients the same way from each extraction method to
compare the results between image factor-analysis extraction and principal
components. There were actually 86,400 CFAR equations: half
using statistical factor-analysis extraction and half using principal
components.1
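The counts quoted in this section follow directly from the design. A short check of the arithmetic, using only figures stated in the text (the variable names are ours):

```python
original_populations = 3
associative_sets = 4
generators = 2                      # two-factor and four-factor generators
sample_sizes = (16, 64, 256)
samples_per_size = 100
max_factors = 6                     # one through six factors extracted
extraction_methods = 2              # statistical routine, principal components

initial_populations = original_populations * associative_sets * generators
n_samples = initial_populations * len(sample_sizes) * samples_per_size
cfar_per_method = n_samples * max_factors
cfar_total = cfar_per_method * extraction_methods

print(initial_populations, n_samples, cfar_per_method, cfar_total)
# 24 7200 43200 86400
```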

1 L. R. Tucker, Department of Psychology, University of Illinois, suggested at
the time we ran the calculations of the experiment that we factor-analyze only
the explanatory variable correlation matrix rather than the augmented matrix
to save computer time on such a large experiment. The estimating equation then
becomes $\hat{B} = \hat{R}_{xx}^{-1} R_{xy}$ rather than $\hat{B} = \hat{R}_{xx}^{-1} \hat{R}_{xy}$. Although the difference in results
is probably only marginal, we now believe that conceptually the augmented
matrix should be the matrix to factor-analyze. We have no way of knowing
whether a marginal improvement would have been great enough to compensate
for the cost of the extra calculation.
ASSOCIATIVE CHARACTERISTICS

Tables 1 through 6 give the details of the associative or internal
characteristics of each of the initial 24 populations. Table 1 shows the
four initial populations with the four different associative characteristic
ranges generated by the two-factor generator from the first original
population (see appendix). Table 2 shows four additional initial populations
with the four associative ranges generated by the four-factor
generator from the first original population. Tables 3 and 4 show the
corresponding eight additional initial populations generated from the
second original population. Tables 5 and 6 show the corresponding eight
additional initial populations generated from the third original
population.


Table 1. Characteristics of Four Initial Populations (1-4), Including Regression
Coefficients, Produced With the Two-Factor Generator From Original Population One

Associative characteristics: frequency of r_xy and frequency of r_xx by
correlation range, for initial populations 1 through 4. Counts are listed in
source order; their assignment to the eight columns (r_xy, then r_xx, for each
population) is not recoverable.

Correlation range    Counts
-1.0 to -.9          none
-.9 to -.8           none
-.8 to -.7           none
-.7 to -.6           none
-.6 to -.5           2
-.5 to -.4           none
-.4 to -.3           1, 1, 2, 1
-.3 to -.2           1, 1, 1, 1
-.2 to -.1           3, 1, 2, 1, 2
-.1 to 0             1, 3, 1, 7, 3, 8, 1, 6
0 to .1              3, 1, 4, 3, 14, 4, 6
.1 to .2             1, 5, 1, 5, 6, 31, 5
.2 to .3             1, 2, 9, 11, 5, 13
.3 to .4             6, 3, 11, 1, 8
.4 to .5             5, 3, 19, 12
.5 to .6             2, 5, 8, 5
.6 to .7             2, 8, 5
.7 to .8             3, 17, 2
.8 to .9             6
.9 to 1.0            none

Population regression parameters (by OLS)

Statistical    Initial         Initial         Initial         Initial
estimator      population 1    population 2    population 3    population 4
R²              .6806           .3474           .0647           .1195
b1             -.0283          -.0111          -.0005          -.0277
b2              .1071           .1068           .0593           .0284
b3              .1129           .0850           .0526           .0549
b4              .2457           .1575           .0904           .1959
b5             -.1121          -.0930          -.0467          -.0450
b6             -.0035           .0077           .0097          -.0141
b7              .0528           .0627           .0312           .0058
b8              .1609           .1416           .0764           .0578
b9             -.0537          -.0443          -.0193          -.0197
b10             .0881           .0896           .0510           .0222
b11             .1424           .1262           .0718           .0503
b12            -.0755          -.0688          -.0321          -.0225


Table 2. Characteristics of Four Initial Populations (5-8) Produced With
the Four-Factor Generator From Original Population One

Associative characteristics: frequency of r_xy and frequency of r_xx by
correlation range, for initial populations 1 through 4. Counts are listed in
source order; their assignment to the eight columns (r_xy, then r_xx, for each
population) is not recoverable.

Correlation range    Counts
-1.0 to -.9          none
-.9 to -.8           none
-.8 to -.7           none
-.7 to -.6           none
-.6 to -.5           none
-.5 to -.4           1, 12
-.4 to -.3           1, 1
-.3 to -.2           5, 1, 3, 4
-.2 to -.1           4, 6, 3, 1, 4
-.1 to 0             2, 5, 1, 8, 2, 15, 2, 8
0 to .1              1, 4, 3, 10, 6, 25, 4, 13
.1 to .2             1, 8, 1, 10, 4, 20, 2, 7
.2 to .3             1, 7, 2, 8, 3, 3, 12
.3 to .4             2, 5, 3, 11, 7
.4 to .5             6, 1, 10, 4
.5 to .6             2, 5, 5
.6 to .7             2, 9, 1
.7 to .8             5
.8 to .9             none
.9 to 1.0            none

Population regression parameters (by OLS)

Statistical    Initial         Initial         Initial         Initial
estimator      population 1    population 2    population 3    population 4
R²              .6626           .316            .0513           .1123
b1             -.0570          -.0442          -.0190          -.0158
b2              .1346           .1230           .0608           .0381
b3             -.0732          -.0275          -.0018          -.0681
b4              .3206           .1944           .0999           .2408
b5              .1251           .0966           .0408           .0403
b6             -.05333         -.0265          -.0046          -.0336
b7              .0337           .0422           .0216           .0033
b8              .2080           .1766           .0846           .0641
b9              .0184           .0062          -.0003           .0112
b10             .0554           .0582           .0338           .0134
b11             .1953           .1516           .0732           .0775
b12            -.1626          -.1444          -.0598          -.0397
The associative characteristic sets were developed on the following
criteria: Associative characteristic set 1 was to have a high R² and a
wide range of frequency of r_xy and r_xx but with a large share of the
zero-order correlations in the upper range (above 0.6); set 2 was to have
a medium R² and zero-order correlation coefficients not as high, but still
predominantly on the upper part of the range; set 3 was to have a relatively
low R² and a small range of zero-order correlation coefficients;
and set 4 was to have a medium to low R² with a wide range of zero-order
correlation coefficients. Also, since we were trying to simulate
socio-economic variables, we had the criterion for all sets that a major
share of the correlations should be positive. These objectives are met
reasonably well as shown by the frequency distribution of the data and
the R²'s in Tables 1 through 6. The range in R²'s for set 1 is from
0.6626 to 0.8147; set 2 is from 0.3156 to 0.4682; set 3 is from 0.0512 to
0.1463; and set 4 is from 0.1123 to 0.6084.


Table 3. Characteristics of Four Initial Populations (9-12) Produced
With the Two-Factor Generator From Original Population Two

Associative characteristics: frequency of r_xy and frequency of r_xx by
correlation range, for initial populations 1 through 4. Counts are listed in
source order; their assignment to the eight columns (r_xy, then r_xx, for each
population) is not recoverable.

Correlation range    Counts
-1.0 to -.9          none
-.9 to -.8           none
-.8 to -.7           1
-.7 to -.6           1
-.6 to -.5           2
-.5 to -.4           3, 2
-.4 to -.3           1, 2
-.3 to -.2           2, 4, 3
-.2 to -.1           2, 2, 5, 5
-.1 to 0             5, 7, 11, 9
0 to .1              2, 2, 2, 2, 3, 17, 3, 7
.1 to .2             2, 1, 5, 7, 26, 1, 12
.2 to .3             1, 3, 2, 13, 2, 7, 2, 11
.3 to .4             8, 2, 8, 1, 6
.4 to .5             2, 5, 3, 16, 2, 4
.5 to .6             5, 2, 5, 3, 3
.6 to .7             2, 9, 4
.7 to .8             3, 11, 2
.8 to .9             2, 4
.9 to 1.0            none

Population regression parameters (by OLS)

Statistical    Initial         Initial         Initial         Initial
estimator      population 1    population 2    population 3    population 4
R²              .7521           .4180           .1142           .3724
b1              .1212           .1082           .0749           .0679
b2              .1537           .1345           .0957           .1063
b3              .0803           .0948           .0604           .0238
b4              .0575           .0603           .0431           .0321
b5              .1241           .1213           .0849           .0655
b6              .2208           .1538           .1089           .2754
b7              .0545           .0597           .0414           .0259
b8              .0003          -.0023           .0006           .0059
b9              .0242           .0317           .0176           .0017
b10             .0570           .0687           .0415           .0132
b11             .1601           .1276           .0926           .1484
b12             .0054           .0090           .0036          -.0041


Table 4. Characteristics of Four Initial Populations (13-16) Produced
With the Four-Factor Generator From Original Population Two

Associative characteristics: frequency of r_xy and frequency of r_xx by
correlation range, for initial populations 1 through 4. Counts are listed in
source order; their assignment to the eight columns (r_xy, then r_xx, for each
population) is not recoverable.

Correlation range    Counts
-1.0 to -.9          none
-.9 to -.8           none
-.8 to -.7           none
-.7 to -.6           none
-.6 to -.5           none
-.5 to -.4           4
-.4 to -.3           1, 4
-.3 to -.2           2, 7
-.2 to -.1           6, 6, 1, 8
-.1 to 0             3, 6, 1, 19, 11
0 to .1              1, 5, 3, 7, 7, 30, 3, 15
.1 to .2             1, 5, 2, 13, 4, 17, 3, 13
.2 to .3             1, 10, 4, 11, 2, 7
.3 to .4             10, 1, 10, 1, 6
.4 to .5             4, 3, 2, 10, 2, 5
.5 to .6             1, 5, 1
.6 to .7             1, 9
.7 to .8             2
.8 to .9             none
.9 to 1.0            none

Population regression parameters (by OLS)

Statistical    Initial         Initial         Initial         Initial
estimator      population 1    population 2    population 3    population 4
R²              .7203           .3623           .0802           .3277
b1             -.0106           .0131           .0035           .0913
b2              .3159           .2277           .1345           .2866
b3              .1937           .1871           .0910           .0610
b4             -.0272          -.0040           .0125          -.0193
b5             -.0791          -.0305          -.0014          -.0688
b6              .3458           .2106           .1133           .3370
b7              .0574           .0653           .0475           .0303
b8              .0260           .0238           .0058          -.0060
b9             -.0938          -.0804          -.0411          -.0395
b10             .0526           .0600           .0388           .0202
b11             .0537           .0653           .0614           .0694
b12             .1394           .1224           .0586           .0595

Table 5. Characteristics of Four Initial Populations (17-20) Produced
With the Two-Factor Generator From Original Population Three

Associative characteristics: frequency of r_xy and frequency of r_xx by
correlation range, for initial populations 1 through 4. Counts are listed in
source order; their assignment to the eight columns (r_xy, then r_xx, for each
population) is not recoverable.

Correlation range    Counts
-1.0 to -.9          none
-.9 to -.8           none
-.8 to -.7           1
-.7 to -.6           3, 1
-.6 to -.5           1, 1
-.5 to -.4           2, 2, 1
-.4 to -.3           2, 4, 3
-.3 to -.2           3, 3, 1, 1
-.2 to -.1           1, 2, 2, 5, 1, 5
-.1 to 0             2, 4, 1, 9, 3
0 to .1              1, 2, 1, 6, 3, 20, 1, 6
.1 to .2             5, 2, 6, 6, 27, 2, 10
.2 to .3             1, 3, 1, 10, 2, 4, 1, 15
.3 to .4             7, 2, 7, 1, 9
.4 to .5             2, 5, 5, 20, 3, 6
.5 to .6             6, 1, 2, 2, 2
.6 to .7             4, 5, 1, 2
.7 to .8             2, 15, 1
.8 to .9             1, 2
.9 to 1.0            none

Population regression parameters (by OLS)

Statistical    Initial         Initial         Initial         Initial
estimator      population 1    population 2    population 3    population 4
R²              .8147           .4682           .1463           .6084
b1              .1941           .1664           .1263           .2071
b2              .1221           .1178           .0881           .1027
b3              .1334           .1276           .0941           .1064
b4              .0193           .0036           .0072           .1162
b5              .1257           .1245           .0917           .0974
b6              .1446           .1365           .1028           .1288
b7              .0356           .0326           .0245           .0281
b8              .1023           .0880           .0692           .1173
b9              .2156           .1500           .1120           .3595
b10             .0858           .0922           .0610           .0392
b11            -.0340          -.0372          -.0246           .0049
b12             .0589           .0700           .0448           .0199


Tables 1 through 6 also give the standardized OLS regression coefficients
for each of the 24 initial populations. We assume these regression
coefficients are the population parameters or expected values for
each of the 24 initial populations. Tables 1 through 6 also give the values
of the determinant of the augmented correlation matrix as some indication
of the degree of multicollinearity. The closer the determinant is
to zero, the greater is the degree of multicollinearity. If the R² is high,
then we would expect the determinant of the augmented correlation
matrix to be near zero. But since the highest R² of any of the 24 initial
populations is 0.8147, the small size of the determinants also reflects
a high degree of multicollinearity among the explanatory variables.
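The use of the determinant of a correlation matrix as an indicator of multicollinearity can be illustrated with a simple case. The sketch below assumes a hypothetical correlation matrix in which every pair of variables has the same correlation rho; it is not one of the study's matrices.

```python
import numpy as np

def equicorrelation(k, rho):
    """k x k correlation matrix with every off-diagonal entry equal to rho."""
    return (1.0 - rho) * np.eye(k) + rho * np.ones((k, k))

# Determinant for increasing common correlation among four variables.
dets = {rho: float(np.linalg.det(equicorrelation(4, rho)))
        for rho in (0.0, 0.5, 0.9, 0.99)}
for rho, det in dets.items():
    print(rho, det)
```

For this matrix the determinant equals (1 - rho)^(k-1) (1 + (k-1) rho), so it is 1 when the variables are uncorrelated and falls toward zero as rho approaches 1.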


Table  6.  Characteristics  of  Four  Initial  Populations   (21-24)   Produced 
With  the  Four-Factor  Generator  From  Original  Population  Three 

Associative  characteristics 


_        .     .                      Initial 
Correlation           population  1 

Initial 
population  2 

Initial                      Initial 
population  3          population  4 

Freq.      Freq. 
of  rxy     of  rxx 

Freq.      Freq. 
of  rxy      of  rxx 

Freq.      Freq.        Freq.      Freq. 
of  rxy     of  rxx        of  rxv      of  rxx 

-l.Oto  -.9 
-  .  9  to  -  .  8 
-  .  8  to  -  .  7 
-  .  7  to  -  .  6 
-  .  6  to  -  .  5 
-  .  5  to  -  .  4 
-.4to  -.3 

1 

1 
4 

1 

1 

-.3  to  -.2 

5 

5 

3 

-  .  2  to  -  .  1 

2 

6 

2 

8 

-.Ito    0 

2 

9 

2           10 

2 

20               2 

10 

Oto  .1 

1 

9 

1            10 

4 

30               1 

12 

.  1  to  .  2 

1 

2 

2             8 

6 

15               2 

13 

.2  to  .3 

1 

7 

2           13 

3 

12 

.3  to  .4 

1 

8 

2             9 

2 

5 

.4  to  .5 

1 

5 

3             4 

1 

1 

.5 to .6

2 

7 

1 

1 

.6 to .7

2

4

.7 to .8

1

2

.8 to .9

.9 to 1.0

Statistical     Population regression parameters (by OLS)
estimator
                Initial         Initial         Initial         Initial
                population 1    population 2    population 3    population 4

R²               .7861           .4179           .1052           .5187
b1               .0468           .0457           .0328           .0284
b2               .1484           .1339           .0874           .1179
b3               .2672           .2169           .1406           .2828
b4               .1476           .0845           .0590           .3179
b5              -.0134          -.0040          -.0021          -.0328
b6               .2434           .2025           .1270           .2214
b7               .1078           .0993           .0611           .0701
b8              -.0491          -.0343          -.0229          -.0842
b9               .1992           .1269           .0808           .3373
b10              .1687           .1635           .0923           .0782
b11             -.0375          -.0280          -.0099          -.0049
b12              .1529           .1471           .0831           .0829
RESULTS
Efficiency 

Efficiency refers to the size of the variance of an estimator relative
to the variance of another estimator or a standard estimator. The
smaller the variance of an estimator, the more efficient the estimator is,
and the estimator with greatest efficiency (often referred to as the
efficient estimator) is the estimator with the smallest variance.

We know from statistical theory (Anderson, 1958) that the OLS
estimator is inefficient and that the variance is unreliable for probability
estimates when there are errors in the explanatory variables.

Thus one important characteristic of the CFAR estimator to investigate
is the variance of this estimator. Since there are 43,200 equations,
each with 12 bj values, it is impossible to make or report all the
comparisons one might like to make. Instead, the variances for each
estimator (calculated from each of the samples of 100) were summed and
averaged over the 12 bj for certain Monte Carlo variables: sample size
(N = 16, 64, 256), associative characteristics (the internal population
relationships), and the number of factors extracted, from one through
six. We have essentially summarized the 5,184 variances related to the
Monte Carlo variables.
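
The averaging step described above can be sketched as follows. This is only an illustration under stated assumptions: the estimates are simulated noise around hypothetical parameter values rather than actual CFAR output, the function names are invented for the example, and the use of the n - 1 divisor (`ddof=1`) is an assumption, since the divisor used in the study is not stated.

```python
import numpy as np

def mean_coefficient_variance(estimates):
    """Average, over coefficients, of the across-sample variance of each b_j.

    `estimates` has shape (n_samples, n_coefficients), one row per sample.
    """
    per_coefficient = np.var(estimates, axis=0, ddof=1)  # ddof=1 is an assumption
    return float(per_coefficient.mean())

rng = np.random.default_rng(1)
true_b = np.linspace(-0.05, 0.30, 12)  # hypothetical parameter values

def simulate_estimates(n_obs, n_samples=100):
    # Stand-in for a real estimator: noise whose spread shrinks like 1/sqrt(n_obs).
    noise = rng.normal(scale=1.0 / np.sqrt(n_obs), size=(n_samples, true_b.size))
    return true_b + noise

summary = {n: mean_coefficient_variance(simulate_estimates(n)) for n in (16, 64, 256)}
for n, value in summary.items():
    print(f"N = {n:3d}: mean variance = {value:.6f}")
```

Each entry of `summary` plays the role of one cell in a table of mean variances: 100 samples give one variance per coefficient, and the 12 variances are then averaged.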

The  data  relating  the  mean  variances  to  the  sample  size  and  number 
of  factors  extracted  are  given  in  Table  7.  The  mean  variance,  very  small 
when  the  sample  size  is  the  largest  (256),  remains  consistently  small 
for  all  factors  extracted.  The  mean  variance  is  still  quite  small  for  the 
medium  sample  size  (64),  but  tends  to  increase  as  the  number  of 
factors  extracted  is  increased.  The  mean  variance  is  small  even  for 
sample  size  16.  The  fact  that  the  mean  variance  gets  smaller  as  the 
sample  size  increases  is  important  because  it  indicates  that  CFAR  is  a 
consistent  estimator;  that  is,  the  variance  asymptotically  approaches  a 
minimum  as  sample  size  increases. 

Table 7. Mean Variance of the CFAR Estimators by Sample Size
and Number of Factors Extracted

Factors                Sample size
extracted      N = 16      N = 64      N = 256

1             .006176     .001803     .000764
2             .010808     .002095     .000579
3             .018060     .003286     .000960
4             .027348     .004815     .001347
5             .039844     .007020     .002084
6             .056240     .009330     .002699


Table 8. Mean Variance of the CFAR Estimators by Sample Size
and Associative Characteristics

Associative                Sample size
characteristics    N = 16      N = 64      N = 256

R1                .023146     .004126     .001322
R2                .026211     .004315     .001317
R3                .028443     .005483     .001486
R4                .027851     .004976     .001497

The mean variance increases as the number of factors extracted
increases for both extraction methods. As the number of factors extracted
increases, the solution approaches the OLS solution and is the same as
the OLS solution when the maximum possible number of factors is
extracted. This result implies that the CFAR solutions are always more
efficient than the OLS solutions.
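
The convergence to OLS can be illustrated with a simplified stand-in. The sketch below uses ordinary principal components regression on centered explanatory variables only; this is not the CFAR procedure itself (which works from a factor analysis), and the data, dimensions, and function names are assumptions for illustration. It shows that when all components are retained the coefficients coincide with OLS, while retaining fewer components gives a projected, shorter coefficient vector.

```python
import numpy as np

def ols(X, y):
    """OLS slopes on centered data (intercept handled by centering)."""
    Xc, yc = X - X.mean(axis=0), y - y.mean()
    coef, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    return coef

def pc_regression(X, y, k):
    """Regress y on the first k principal component scores of X,
    then map the result back to coefficients on the original variables."""
    Xc, yc = X - X.mean(axis=0), y - y.mean()
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)  # rows of vt: principal axes
    V = vt[:k].T                  # (p, k) loadings of the retained components
    scores = Xc @ V               # (n, k) component scores
    gamma, *_ = np.linalg.lstsq(scores, yc, rcond=None)
    return V @ gamma

rng = np.random.default_rng(2)
n, p = 64, 6
latent = rng.normal(size=(n, 2))
X = latent @ rng.normal(size=(2, p)) + 0.3 * rng.normal(size=(n, p))
y = X @ np.ones(p) + rng.normal(size=n)

b_ols = ols(X, y)
b_full = pc_regression(X, y, p)   # all components retained
b_two = pc_regression(X, y, 2)    # only two components retained
print(np.round(b_ols, 4))
print(np.round(b_full, 4))
print(np.round(b_two, 4))
```

In this stand-in the truncated solution is the projection of the OLS solution onto the retained principal axes, which is why it cannot be longer than the OLS coefficient vector.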

Table 8 relates the mean variances to the four selected associative
characteristics and to the sample size. The four sets of associative
characteristics explained earlier are designated R1 for the populations
with high R², R2 for the populations with medium R², R3 for the
populations with low R², and R4 for the populations with a wider-ranging
R². Intercorrelations among the population variables also differ. While
the associative set with the highest R² and the highest intercorrelations
among the explanatory variables has the smallest variances for the CFAR
estimators, the average variances for the other sets are also small and
well behaved. The variances for the CFAR estimators drop sharply in
magnitude when we go from sample size 16 to sample size 64, and again to
sample size 256. This is exactly the way we would like to have the CFAR
estimator behave in order to recommend it as an extremely good estimator
for errors-in-the-variables regression. The variances were smallest
regardless of sample size for the population characteristics which had
the highest R² and the highest intercorrelations among the explanatory
variables, also a very desirable feature.

The mean variances are related to the number of factors extracted
and the associative characteristics in Table 9. These data illustrate
again the increase in variance as the number of factors extracted
increases. There is little difference in the variances from one associative
characteristic to another. Except when only one factor is extracted,
the variances are smallest for the two populations having the highest
R² and higher intercorrelations among the explanatory variables.

Table 9. Mean Variance of the CFAR Estimator by
Number of Factors and Associative Characteristics

Number                Associative characteristics*
of
factors        R1          R2          R3          R4

1           .003193     .003121     .003026     .002316
2           .003620     .004533     .005439     .004384
3           .005927     .007256     .008763     .007795
4           .009136     .010879     .012790     .011876
5           .014480     .015728     .017504     .017552
6           .020831     .022168     .023303     .024724

* In this and following tables, R1 = High r, R2 = Medium r, R3 = Low
r, and R4 = Wide range r.




Normality 

Normality refers to how closely the distribution of the CFAR
estimator approaches the normal distribution. The method we chose to
analyze this question was to calculate for each bj the higher moments
of the distribution (skewness and kurtosis), since both skewness and
kurtosis of the normal distribution are zero. The Kolmogorov-Smirnov
statistic, an alternative statistic, was not used because the moments are
more sensitive, particularly in the tails of the distribution. The moments
were calculated and averaged, again relating the mean of the moments
to the Monte Carlo variables.
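
A minimal sketch of the two moment measures follows. It assumes the simple population-moment forms (third and fourth standardized moments, with 3 subtracted from the latter so that the normal distribution scores zero); whether the study used these forms or small-sample corrected versions is not stated, so the function names and formulas here are illustrative assumptions.

```python
import numpy as np

def skewness(x):
    """Third standardized moment; zero for a symmetric distribution."""
    d = np.asarray(x, dtype=float)
    d = d - d.mean()
    m2 = np.mean(d ** 2)
    return float(np.mean(d ** 3) / m2 ** 1.5)

def excess_kurtosis(x):
    """Fourth standardized moment minus 3; zero for the normal distribution."""
    d = np.asarray(x, dtype=float)
    d = d - d.mean()
    m2 = np.mean(d ** 2)
    return float(np.mean(d ** 4) / m2 ** 2 - 3.0)

rng = np.random.default_rng(3)
normal_draws = rng.normal(size=100_000)
skewed_draws = rng.exponential(size=100_000)  # a clearly non-normal comparison

print(skewness(normal_draws), excess_kurtosis(normal_draws))
print(skewness(skewed_draws), excess_kurtosis(skewed_draws))
```

Applied to the 100 estimates of a single bj, both functions would return values near zero if the sampling distribution were close to normal.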

Skewness 

Summary data for skewness are given in Table 10 with respect to
sample size and the number of factors extracted. All values obtained
for skewness are small. Skewness approaches zero as sample size
increases and as the number of factors extracted increases. The
skewness in the largest sample size is consistently small regardless of
the number of factors extracted.

Skewness  related  to  sample  size  and  the  four  sets  of  populations 
with  different  associative  characteristics  is  given  in  Table  11.  While 
the  skewness  does  not  seem  to  bear  a  consistent  relationship  among 
the  various  associative  characteristics  for  each  sample  size,  it  is  clear 
again  that  the  skewness  approaches  zero  as  sample  size  increases  —  the 
largest  improvement  being  made  as  the  sample  size  increases  from  16 
to  64. 

Skewness related to associative characteristics and the number of
factors extracted is given in Table 12. The skewness declines
consistently for all associative groups as the number of factors extracted is


Table 10. Mean Skewness of the CFAR Estimator by Sample Size
and Number of Factors Extracted

Number                  Sample size
of
factors        N = 16       N = 64       N = 256

1             1.273076      .209734     -.013147
2              .769982      .200702      .042032
3              .597069      .121915      .056393
4              .402840      .098575      .060511
5              .276784      .076460      .043235
6              .185260      .052464      .019319

Table 11. Mean Skewness of the CFAR Estimator by Sample Size
and Associative Characteristics

Associative                 Sample size
characteristics    N = 16       N = 64       N = 256

R1                 .568742      .148125      .037540
R2                 .727828      .087259      .015515
R3                 .465930      .111601     -.006986
R4                 .574174      .159582      .092826

Table 12. Mean Skewness of the CFAR Estimator by
Associative Characteristics and Number of Factors

Number                Associative characteristics
of
factors        R1          R2          R3          R4

1           .280749     .419263     .440154     .819383
2           .359442     .395388     .283772     .311686
3           .304982     .315598     .194992     .218266
4           .227333     .234565     .136653     .150683
5           .191042     .185158     .056596     .095842
6           .145269     .111230     .028922     .057304

increased, except for group R1 in going from one factor to two factors
extracted. In all cases the skewness is the least for six factors extracted.
The salient points shown by this Monte Carlo experiment regarding
skewness are: skewness, while generally positive, is very small in all
cases; skewness approaches zero as sample size increases; skewness
approaches zero as the number of factors increases; and skewness does
not seem to be consistently related to the associative characteristics.

Kurtosis 

Kurtosis was calculated and related to the Monte Carlo variables
in the same way as variance and skewness. If the distribution is not
kurtotic relative to the normal distribution, the value for kurtosis is
equal to or near zero. The mean kurtosis values related to sample size
and number of factors extracted are given in Table 13. All kurtosis
values are small, and the kurtosis approaches zero as sample size
increases. The sharpest reduction in kurtosis was made in going from
sample size 16 to 64. Except for an aberration at three factors for the two
larger sample sizes, the kurtosis also approaches zero as the number of
factors extracted increases.

Kurtosis related to sample size and associative characteristics is given
in Table 14. Here again the kurtosis values for all combinations are
small and approach zero as the sample size increases, with the sharpest
reduction in kurtosis occurring in moving from the size 16 sample to
sample size 64. The populations with high R² and high intercorrelations
among the explanatory variables seem to have the kurtosis values closest
to zero at all sample sizes.

Kurtosis values related to associative characteristics and number
of factors extracted are given in Table 15. Except for an aberration for
group R1 when two factors are extracted, the consistent pattern of
kurtosis values indicates that they approach zero as the number of
factors extracted increases. Otherwise there seems to be no clear pattern
of how the kurtosis values relate to the associative characteristics at
corresponding factor numbers.


Table 13. Mean Kurtosis of the CFAR Estimator by Sample Size
and Number of Factors

Number                  Sample size
of
factors        N = 16       N = 64       N = 256

1             8.746820      .994613      .469777
2             5.510900      .504834      .222304
3             4.436723      .709071      .352426
4             3.701197      .425751      .070094
5             2.989154      .285380      .010045
6             2.260743      .097709     -.135707


Table 14. Mean Kurtosis of the CFAR Estimator by Sample Size
and Associative Characteristics

Associative                 Sample size
characteristics    N = 16       N = 64       N = 256

R1                3.842346      .472932      .087571
R2                4.658900      .394044      .190024
R3                4.925442      .469767      .109464
R4                5.003670      .674829      .272235

The findings on kurtosis can be stated as follows: all the kurtosis
values are small; kurtosis values approach zero as sample size
increases; and kurtosis values approach zero as the number of factors
extracted increases.


Bias 

The bias is the difference between the expected value of the estimator
(its average value over all samples) and the true value of the
parameter: bias = $E(\hat{B}) - B$. Because the large amount of data
restricted the extent of investigation and presentation of each item, we
averaged the bias over all parameters for comparison purposes. The result
shown in each cell of Table 16 is the average bias calculated as follows
for each subclassification of the Monte Carlo experiment:

(11)  $\sum_{j=1}^{12} \sum_{i=1}^{100} (\hat{B}_{ij} - B_j) / 1200$

where $B_j$ is the hypothesized parameter of the j-th explanatory variable,
calculated by using OLS on the initial population without error, and
$\hat{B}_{ij}$ is the estimator for the j-th explanatory variable calculated
from the i-th sample after the addition of random normal errors to all
observations of all variables.
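
Equation 11 amounts to a plain average of signed differences over samples and coefficients. The sketch below is an illustration only; the function name and the tiny hand-made arrays are assumptions, not data from the experiment.

```python
import numpy as np

def average_bias(estimates, true_b):
    """Equation 11: mean over samples i and coefficients j of (B_hat_ij - B_j).

    `estimates` has shape (n_samples, n_coefficients); `true_b` has one
    entry per coefficient. With 100 samples and 12 coefficients the
    divisor is 1200.
    """
    estimates = np.asarray(estimates, dtype=float)
    diff = estimates - np.asarray(true_b, dtype=float)
    return float(diff.sum() / diff.size)

# Two samples, two coefficients (hypothetical numbers).
example_estimates = [[1.0, 2.0],
                     [3.0, 2.0]]
example_true = [2.0, 2.5]
example_bias = average_bias(example_estimates, example_true)
print(example_bias)
```

Because the differences are signed, positive and negative biases on different coefficients can offset each other in this average, so a small value does not by itself show that every coefficient is nearly unbiased.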

The results of calculation of the bias are given in Table 16. The
bias of the OLS estimator after errors are added to the initial
observations of the variables relative to the OLS values before errors are
added is given in the last row of each part of the table for comparison
purposes. The results are given for both factor analysis and principal
component extraction and by sample size and number of factors extracted.
The OLS
Table 15. Mean Kurtosis of the CFAR Estimator by
Associative Characteristics and Number of Factors

Number                Associative characteristics
of
factors        R1           R2           R3           R4

1           1.939687     3.001932     4.525533     4.147795
2           1.426016     2.013136     2.320110     2.558121
3           1.999107     1.810255     1.544697     1.976898
4           1.461554     1.619087     1.127421     1.387992
5           1.236110     1.221622      .836656     1.085051
6            .743223      .819904      .654925      .745608

bias is small for all three sample sizes but does not appear to be
consistent, going from positive to negative to positive, nor does it
appear to be asymptotic.

In all cases the bias for the CFAR estimators is consistent and
negative, but asymptotically approaches zero as sample size increases.
For the largest sample size, the bias of the CFAR estimators when four
or more factors are extracted is smaller in magnitude than the bias of
the OLS estimator. Using principal components rather than statistical
factor analysis gives comparable results. While the bias for factor
analysis is generally equal to or smaller than the bias of principal
components, the differences are sufficiently small that the extra cost of
statistical factor extraction relative to the cost of principal
components appears to be greater than the advantage gained.

The second part of Table 16 gives the bias by number of factors
extracted and the internal population characteristics. Here the OLS
estimators also seem to be inconsistent. The CFAR estimators are negatively
biased and in general are smallest when there is a high intercorrelation
among the variables in the population. There are several cases among
these data where the principal components method results in better
(smaller bias) estimates than the statistical factor extraction method.
Differences in the magnitude of the bias are smaller than we had
expected among the various populations. The high correlation populations
have the smallest bias, but the magnitude of the bias of the CFAR
estimators for the other populations is about the same.

While the most desirable outcome for the bias would be if the
CFAR estimators were unbiased, the bias does appear to be well
behaved; that is, the bias is negative, consistent, and asymptotically
approaches zero as sample size increases. Also, the bias is smallest for
the kind of population characteristics that we are most concerned about
and for which this estimator was initially developed.
Table 16. Average Bias of B From the True Value of B

No. of            N = 16            N = 64            N = 256        Overall average
factors
extracted       FA       PC       FA       PC       FA       PC       FA       PC

1            -.0332   -.0317   -.0204   -.0229   -.0150   -.0186   -.0229   -.0244
2            -.0168   -.0178   -.0073   -.0102   -.0041   -.0078   -.0094   -.0119
3            -.0109   -.0113   -.0035   -.0041   -.0020   -.0017   -.0054   -.0057
4            -.0077   -.0084   -.0017   -.0023   -.0004   -.0004   -.0033   -.0037
5            -.0053   -.0070   -.0012   -.0020   -.0003   -.0004   -.0023   -.0031
6            -.0039   -.0056   -.0009   -.0016   -.0003   -.0005   -.0017   -.0028
OLS               .0012            -.0002             .0009             .0006

Internal Characteristics

No. of            High r           Medium r           Low r          Wide range r
factors
extracted       FA       PC       FA       PC       FA       PC       FA       PC

1            -.0221   -.0229   -.0209   -.0238   -.0181   -.0020   -.0304   -.0288
2            -.0087   -.0096   -.0102   -.0113   -.0099   -.0133   -.0087   -.0134
3            -.0040   -.0030   -.0060   -.0052   -.0065   -.0088   -.0054   -.0057
4            -.0015   -.0007   -.0041   -.0034   -.0044   -.0072   -.0030   -.0034
5            -.0007   -.0005   -.0032   -.0031   -.0030   -.0058   -.0021   -.0031
6            -.0009   -.0002   -.0025   -.0026   -.0018   -.0049   -.0016   -.0026
OLS               .0002            -.0010             .0012            -.0008

Note: FA means a statistical factor extraction; PC means extraction by principal
components.


Table 17. Average Mean Square Error of B From the True Value of B

No. of            N = 16            N = 64            N = 256        Overall average
factors
extracted       FA       PC       FA       PC       FA       PC       FA       PC

1             .0120    .0100    .0070    .0085    .0059    .0080    .0083    .0089
2             .0133    .0085    .0039    .0055    .0024    .0048    .0065    .0063
3             .0194    .0099    .0040    .0044    .0015    .0030    .0083    .0058
4             .0282    .0130    .0052    .0046    .0017    .0025    .0117    .0067
5             .0406    .0176    .0074    .0057    .0024    .0027    .0168    .0087
6             .0570    .0244    .0096    .0068    .0029    .0030    .0232    .0114
OLS               .5023             .0212             .0044             .1760

Internal Characteristics

No. of            High r           Medium r           Low r          Wide range r
factors
extracted       FA       PC       FA       PC       FA       PC       FA       PC

1             .0105    .0121    .0073    .0075    .0045    .0036    .0110    .0122
2             .0064    .0069    .0059    .0046    .0060    .0038    .0077    .0099
3             .0067    .0050    .0077    .0042    .0091    .0050    .0096    .0089
4             .0095    .0049    .0111    .0053    .0130    .0068    .0132    .0099
5             .0148    .0069    .0160    .0072    .0177    .0088    .0186    .0117
6             .0212    .0096    .0224    .0099    .0235    .0116    .0256    .0145
OLS               .1580             .1703             .1677             .2079

Note: FA means a statistical factor extraction; PC means extraction by principal
components.
Loss Function of the Estimators

Another important criterion frequently considered is the loss
function of the estimators: the sum of squares of the differences between
the estimator and the parameter over the number of samples. These
data are given in Table 17. Mathematically the values given in this
table are:

(12)  $\sum_{j=1}^{12} \sum_{i=1}^{100} (\hat{B}_{ij} - B_j)^2 / 1200$

with respect to the particular cell designations of the table. The notation
is the same as for equation 11. As we look at the loss function of the
estimators, we find first that it is smaller in almost all cases than the
corresponding loss function for the OLS estimators. In some cases the
difference is very great, being 37 times greater for the OLS estimators
in one case. Moreover, the loss function for the CFAR estimators
becomes smaller as sample size increases for each level of factor
extraction, and the loss function increases as the number of factors
extracted increases. If we realize that the CFAR estimator is the same as
the OLS estimator in the limit as the number of factors extracted
increases to be equal to the number of variables, we can then see the logic
in the increase in the loss function as the number of factors increases.
Thus the loss function for CFAR estimators should always be less than
for OLS, and would have as its upper bound the value of the OLS loss
function. On examining the differences between statistical factor
extraction and principal component extraction, we find that at the small
sample size, principal component extraction results in smaller loss
functions for all levels of factors extracted. These results are less
pronounced at the medium sample size, and statistical extraction seems
better than principal components at the large sample size.
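
The loss function of equation 12 combines the two properties examined earlier: for each coefficient, the mean squared error equals the variance of the estimates plus the squared bias. The sketch below checks this identity numerically. The shrunken estimator, its noise level, and the function names are hypothetical choices made for illustration, and the variance uses the divisor n (`ddof=0`) so that the identity holds exactly.

```python
import numpy as np

def average_squared_error(estimates, true_b):
    """Equation 12: mean over samples and coefficients of (B_hat_ij - B_j)^2."""
    diff = np.asarray(estimates, dtype=float) - np.asarray(true_b, dtype=float)
    return float(np.mean(diff ** 2))

def variance_and_squared_bias(estimates, true_b):
    """Split the loss into average variance and average squared bias."""
    estimates = np.asarray(estimates, dtype=float)
    variance = estimates.var(axis=0)  # ddof=0, one value per coefficient
    squared_bias = (estimates.mean(axis=0) - np.asarray(true_b, dtype=float)) ** 2
    return float(variance.mean()), float(squared_bias.mean())

rng = np.random.default_rng(4)
true_b = np.array([0.05, 0.15, 0.27])  # hypothetical parameters
# Hypothetical estimator: shrunk toward zero (negatively biased), low variance.
estimates = 0.9 * true_b + rng.normal(scale=0.05, size=(100, 3))

mse = average_squared_error(estimates, true_b)
var_part, bias_part = variance_and_squared_bias(estimates, true_b)
print(mse, var_part + bias_part)
```

This decomposition is why an estimator with a small negative bias can still have a much smaller loss than an unbiased estimator with a large variance.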

With respect to the populations with different internal characteristics,
the OLS loss function again is uniformly larger than for the CFAR
estimators. In most cases the principal component estimators give better
results than the statistical factor procedure. Except for results from
extracting only one factor, the loss function for the CFAR estimators
monotonically increases with increasing number of factors for both
the principal component procedure and the factor analysis procedure.
Furthermore, the upper bound should again be the OLS result when
all factors are extracted.

Loss  Function  of  the  Predicted  Values 

If we are concerned about prediction as well as the structural
parameters, or if we are mainly concerned about prediction, as is frequently
Table 18. Average Mean Square Error of the Predicted Value From the
True Value

No. of            N = 16            N = 64            N = 256        Overall average
factors
extracted       FA       PC       FA       PC       FA       PC       FA       PC

1             .9038    .8921    .7864    .8191    .7479    .7971    .8127    .8361
2             .8435    .8041    .6854    .7085    .6489    .6846    .7259    .7324
3             .8744    .7987    .6671    .6715    .6280    .6393    .7322    .7031
4             .9388    .8225    .6738    .6676    .6273    .6292    .7466    .7064
5            1.0322    .8609    .6874    .6750    .6315    .6307    .7837    .7222
6            1.1552    .9174    .7015    .6831    .6348    .6324    .8305    .7743
OLS              4.4568             .7689             .6424            1.9560

Internal Characteristics

No. of            High r           Medium r           Low r          Wide range r
factors
extracted       FA       PC       FA       PC       FA       PC       FA       PC

1             .5734    .6336    .8039    .8317   1.0031    .9959    .8704    .8831
2             .3816    .4098    .7273    .7271   1.0015    .9959    .7796    .8023
3             .3347    .3376    .7271    .7016   1.0494    .9903    .7815    .7716
4             .3353    .3184    .7496    .7070   1.0933   1.0019    .8082    .7780
5             .3538    .3251    .7867    .7218   1.1460   1.0223    .8483    .7958
6             .3751    .3354    .8360    .7428   1.2140   1.0462    .8970    .8202
OLS               .8660            1.9950            2.8522            2.1110

Note: FA means a statistical factor extraction; PC means extraction by principal
components.


the case, then an important criterion is how the loss function of the
predicted values compares between CFAR equations and OLS equations.
The results shown in Table 18 were calculated as follows:

(13)  $\sum_{j=1}^{100} \sum_{i=1}^{n} (Y_{ij} - \hat{Y}_{ij})^2 / 1200$

where $Y_{ij}$ is the i, j-th population true value, i = 1, 2, ..., n, where n
is the sample size, and j = 1, 2, ..., 100, with 100 repeated samples;
$\hat{Y}_{ij}$ is the predicted value.

Predictions were made by the various procedures: factor analysis with
six different factors, principal components with six different
components, and OLS.

The values actually sampled and used to calculate the estimators and
the predictions were $Y'_{ij}$, which are $Y_{ij} + \delta_{ij}$, where
$\delta_{ij}$ is the random normal error added to the original population
values. Thus the loss function is not the mean square error in the sample,
which would be

(14)  $\sum_{j=1}^{100} \sum_{i=1}^{n} (Y'_{ij} - \hat{Y}_{ij})^2 / 1200$,

which is the square of the difference between the predicted and the
observed, where $Y'$ is the observed, but rather the loss function takes
into account the difference between the predicted and the true value.
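
The distinction between the two loss functions can be made concrete with a small simulation. Everything in the sketch is an assumption chosen for illustration: a single explanatory variable, a straight-line fit by `np.polyfit` standing in for the regression procedures, and arbitrary error variances. It computes the mean squared difference of the predictions from the error-free values and from the observed values.

```python
import numpy as np

def mean_squared_difference(a, b):
    """Average of squared differences between two equal-length arrays."""
    return float(np.mean((np.asarray(a) - np.asarray(b)) ** 2))

rng = np.random.default_rng(5)
n = 2000
x_true = rng.normal(size=n)
y_true = 2.0 * x_true                             # error-free population values
x_obs = x_true + rng.normal(scale=0.5, size=n)    # errors in the explanatory variable
y_obs = y_true + rng.normal(scale=0.5, size=n)    # errors in the dependent variable

# Fit on the observed (error-laden) data, as an investigator would.
slope, intercept = np.polyfit(x_obs, y_obs, 1)
y_pred = slope * x_obs + intercept

loss_true = mean_squared_difference(y_pred, y_true)  # predicted vs. true values
loss_obs = mean_squared_difference(y_pred, y_obs)    # predicted vs. observed values
print(loss_true, loss_obs)
```

Only a Monte Carlo setting, where the error-free values are known, allows the first quantity to be computed; with real data only the second is available.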

The OLS loss is largest at the small sample size. The CFAR loss
is only one-fifth to one-fourth as large as the OLS loss at the small
sample size, showing a considerable improvement over the OLS
predictions. On the small sample size, the loss for the principal component
(PC) procedure was less than the loss for the factor-analysis (FA)
procedure for all six factors. For the medium- and large-size samples,
the results for the PC and the FA procedures were very similar, with
first one and then the other being better. For the medium-size sample
CFAR was better than OLS for all factors extracted except the first and
second.

The results are even more clearly in favor of CFAR when
comparisons are based on differences among internal population characteristics.
Here both the PC and the FA methods result in substantially smaller
loss than OLS for all classifications of internal population
characteristics, and for all the numbers of factors or principal components
extracted. The best predictive ability for both OLS and CFAR occurred
when there was high intercorrelation among the variables. The
predictive ability measured by the loss function of the predicted values was
from about 100 percent to as much as 300 percent better for CFAR
than OLS. These results are shown in the second part of Table 18.

SUMMARY 

A Monte Carlo experiment was developed to study the statistical
characteristics for the beta estimators from classical factor analysis
regression (CFAR), which has been proposed especially for estimating
regressions when there are errors in the variables and when high
multicollinearity makes ordinary least squares inappropriate or completely
infeasible.

This  experiment  took  100  random  samples  of  each  sample  size  of  16, 
64,  and  256  from  each  of  24  initial  populations  having  four  different  sets 
of  associative  or  internal  characteristics  and  12  explanatory  and  one 
dependent  variable.  Thus  there  were  7,200  samples  drawn.  One  through 
six  factors  were  extracted  from  each  and  CFAR  estimated  from  these 
factors,  making  43,200  CFAR  equations  with  12  explanatory  variables 
each.  The  statistical  properties  of  the  CFAR  estimators  were  then 
analyzed. 
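
The counts quoted in the text follow directly from the design. The short check below reproduces them; the variable names are chosen for this illustration.

```python
sample_sizes = (16, 64, 256)
n_populations = 24          # initial populations
samples_per_cell = 100      # random samples per population and sample size
max_factors = 6             # one through six factors extracted
n_coefficients = 12         # explanatory variables per equation

n_samples = len(sample_sizes) * n_populations * samples_per_cell
n_equations = n_samples * max_factors
# One variance per coefficient for each (sample size, population, factors) cell,
# each computed from the 100 samples in that cell.
n_variances = len(sample_sizes) * n_populations * max_factors * n_coefficients

print(n_samples, n_equations, n_variances)
```

These are the 7,200 samples and 43,200 equations of the summary, and the 5,184 variances mentioned in the discussion of efficiency.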

It was found that the CFAR estimators behave extremely well. The
variance is small even at small sample sizes and quickly approaches a
consistent minimum level as sample size is increased. For purposes of
using this estimator where there is high multicollinearity, it is
noteworthy that CFAR estimates for the initial populations which had
the highest R² and the highest intercorrelations among the explanatory
variables consistently had the smallest variances. The good small-sample
results are important, especially for those working with time-series data
for which the number of observations is often limited. The variances
increase as the number of factors extracted increases, so that if this
procedure were carried to the limit (where the number of factors
extracted equaled the number of variables, the OLS regression case), the
variances would balloon up to the OLS values. Thus CFAR is clearly a very
efficient estimator relative to OLS for regressions when there are errors
in the variables.

The experiment also shows that the CFAR estimators are
asymptotically normal both as sample size increases and as the number of factors
increases. This is deduced from the behavior of the third and fourth
moments (skewness and kurtosis), which are both zero in the normal
distribution. Even for small samples a normal or "t" distribution could
be used for probability statements about the CFAR estimators.

The CFAR estimators are consistently negatively biased, but the bias
appears to approach zero monotonically as sample size increases. Two
measures often made of estimators are comparisons of the mean-square
error or loss functions for the estimators themselves and of the loss
function of the prediction. Here, too, CFAR shows substantial advantage
over OLS, being from 100 percent in many cases to as much as 2,500
percent better than OLS.

Thus  the  CFAR  estimator  is  substantially  better  in  several  respects 
than  OLS  for  all  applications  where  there  is  high  multicollinearity  or 
when  there  are  errors  in  the  variables,  regardless  of  sample  size,  and 
CFAR  is  especially  useful  for  small  samples.  We  hypothesize  that 
CFAR  is  also  better  when  the  data  are  plagued  with  outliers. 
APPENDIX: GENERATION OF THE ORIGINAL
AND SAMPLING POPULATIONS

As this was a Monte Carlo investigation, the relevance of the results
depended totally on the appropriateness of the populations used;
therefore, this study varied the characteristics of the populations used for
the study in the same ways that characteristics have been observed to vary
in many applied studies published in the literature and in other known
empirical work.

The population intercorrelation matrix [P] dictates in many ways
the results of any Monte Carlo experiment. The choice of an empirically
derived matrix from the literature was rejected, because any single study
might have had the data generated in a unique way, with an atypical
error structure. The choice was therefore between an artificial but known
and controllable population generation technique, and an unknown and
uncontrollable but natural data-generation technique. The simulation of P
was selected. A procedure was needed which allowed errors in both the
dependent and independent variables and various degrees of
intercorrelation (multicollinearity) among all variables, and which
paralleled the procedure that an investigator might use in selecting
variables.

The methodology of Tucker, Koopman, and Linn (1969) was selected.
Their procedure is based on the latent causal (or factor analytic)
model. For each set of population matrices (there were two replications,
described later), it was hypothesized that the investigator had
attempted to select variables to measure, to varying degrees, the
underlying variables. It was assumed that the criterion measured all the
latent variables. The degree to which each variable measured the
underlying factors was simulated by random selection of integers that
sum to four. This matrix was then row-normed. By this method it was
possible to get overrepresentation of loadings on certain latent
variables. As indicated by the trace of the cross products,
overrepresentation did not occur to a prohibitive degree. This matrix,
the degree to which the variables are assumed to load on the latent
variables [A], is given in Table A1; it has been row-normed to unit
length.
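
The row-norming step can be illustrated with a short sketch in Python with NumPy (not the language of the original study). The integer rows below are hypothetical examples that sum to four; they were chosen because their normed values match entries that appear in Table A1.

```python
import numpy as np

# Hypothetical integer weights, one row per observed variable, each row summing to four.
raw = np.array([
    [3, 1, 0, 0],  # loads mainly on latent variable I, partly on II
    [4, 0, 0, 0],  # loads only on latent variable I
    [1, 1, 1, 1],  # criterion: measures all four latent variables equally
], dtype=float)

# Row-norm: divide each row by its Euclidean length so every row has unit length.
row_normed = raw / np.linalg.norm(raw, axis=1, keepdims=True)
print(np.round(row_normed, 5))
```

The first row becomes (.94868, .31623, 0, 0) and the last becomes (.5, .5, .5, .5).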

However, it was assumed that in practice there is a discrepancy
between the a priori correlation of the variables with the causal
variable and the actual correlation. The true loadings on the underlying
constructs were generated by

(1)  $A = \tilde{A}C + DX(1.0 - C^{2})^{0.5}$

where A = a 13 × 4 matrix of actual loadings on the underlying
constructs,
$\tilde{A}$ = the 13 × 4 matrix of conceptual row-normed loadings given
in Table A1,
Table A1. Conceptual Row-Normed Loadings on the Latent Variables (A)

Observed                   Latent variables
variables            I           II          III         IV

X1                .94868      .31623      .0          .0
X2                .0          .31623      .94868      .0
X3                .0          .94868      .0          .31623
X4                .0          .0          .31623      .94868
X5               1.00000      .0          .0          .0
X6                .0          .94868      .0          .31623
X7                .0          .33333      .66667      .66667
X8                .0          .0          .31623      .94868
X9               1.00000      .0          .0          .0
X10               .0          .31623      .94868      .0
X11               .0          .0         1.00000      .0
X12               .0          .94868      .31623      .0
Y                 .50000      .50000      .50000      .50000

[Tr(A'A)]/13      .24231      .25855      .29188      .20726

C = a 4 × 4 diagonal matrix with constants for each factor
representing the degree of error in specifying how well a
variable loads on a factor. Following Tucker, Koopman,
and Linn, C was generated by random uniform deviates in
the range of 0.70 to 0.90. The diagonal elements of C for
population replication 1 were 0.83140, 0.71735, 0.74041, and
0.82244.

X = a 13 × 4 matrix of standardized normal deviates, and
D = a 13 × 13 diagonal matrix that was used to row-normalize
X to unit length, where $d_{ii} = \left(\sum_{j} x_{ij}^{2}\right)^{-0.5}$.
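
A minimal sketch of Equation 1 in Python with NumPy follows. It assumes that D holds the reciprocal row lengths of X, which is what row-normalizing X to unit length requires. The function name, the seed, and the use of NumPy's generator are assumptions made for illustration and do not reproduce the generator used in the study.

```python
import numpy as np

def true_loadings(A_concept, c_diag, rng):
    """Perturb conceptual loadings: A = A_concept C + D X (I - C^2)^0.5."""
    n_vars, n_factors = A_concept.shape
    c_diag = np.asarray(c_diag, dtype=float)
    C = np.diag(c_diag)
    X = rng.standard_normal((n_vars, n_factors))       # standardized normal deviates
    D = np.diag(1.0 / np.linalg.norm(X, axis=1))        # row-normalizes X to unit length
    noise_scale = np.diag(np.sqrt(1.0 - c_diag ** 2))  # (I - C^2)^0.5
    return A_concept @ C + D @ X @ noise_scale

rng = np.random.default_rng(0)  # arbitrary seed, for illustration only
A_concept = np.array([[0.94868, 0.31623, 0.0, 0.0],   # a row of the form (3, 1, 0, 0) row-normed
                      [0.50000, 0.50000, 0.5, 0.5]])  # criterion row
c_diag = [0.83140, 0.71735, 0.74041, 0.82244]         # diagonal of C, population replication 1
A_true = true_loadings(A_concept, c_diag, rng)
print(np.round(A_true, 5))
```

When every diagonal element of C equals 1 the noise term vanishes and the conceptual loadings are returned unchanged; smaller values of C give the random component more weight.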


Table A2. True Loadings for the Two-Factor Population 1

Var.        Factor 1     Factor 2

X1           .99946       .03277
X2           .49329       .86987
X3           .82605       .56360
X4           .43780       .89907
X5           .81502      -.57943
X6           .98863       .15035
X7           .73175       .68157
X8          -.19622       .98056
X9           .96754      -.25273
X10          .66333       .74833
X11          .43708       .89942
X12          .87765      -.47931
Y            .15204       .98837

Table A3. True Loadings for the Four-Factor Population 1

Var.        Factor 1     Factor 2     Factor 3     Factor 4

X1           .74088       .25534       .45529      -.44262
X2           .44165       .24887       .63778       .57987
X3          -.00629       .89658       .37995       .22748
X4           .15225       .39174       .24289       .87428
X5           .69137      -.14441      -.65570       .26684
X6           .21740       .95499      -.02229       .20058
X7           .23414       .73280       .41418       .48646
X8          -.16205      -.02912      -.03059       .98588
X9           .71000      -.41126       .36330      -.44134
X10          .24554       .65089       .55396       .45735
X11          .63180      -.11712       .64407       .41507
X12          .03940       .80663       .11615      -.57819
Y            .30116      -.12834       .19992       .92350
Number of Underlying Factors

Within each population it was desirable to vary the number of
underlying latent variables; two and four latent variables were selected
for this study. In the two-factor case, a linear sum of the first two and
of the last two columns of matrix A was performed. This procedure ensured
that the two- and four-factor cases would have similar effects from the
stochastic nature of X and C given in Equation 1. The loading matrix for
the two- and four-factor solution was row-normalized to unity and
appears for population 1 in Tables A2 and A3.
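
The collapse from four factors to two can be sketched as below, assuming the "linear sum" is a plain unweighted sum of columns 1 and 2 and of columns 3 and 4 (the function name is hypothetical). Because row-normalization removes any per-row scale, applying the sum to rows of Table A3 should reproduce the corresponding rows of Table A2.

```python
import numpy as np

def collapse_to_two_factors(A4):
    """Sum columns 1-2 and columns 3-4 of a four-factor loading matrix, then row-normalize."""
    A2 = np.column_stack([A4[:, 0] + A4[:, 1], A4[:, 2] + A4[:, 3]])
    return A2 / np.linalg.norm(A2, axis=1, keepdims=True)

# Rows X2 and X3 of Table A3 (four-factor loadings, population 1).
A4 = np.array([[0.44165, 0.24887, 0.63778, 0.57987],
               [-0.00629, 0.89658, 0.37995, 0.22748]])
A2 = collapse_to_two_factors(A4)
print(np.round(A2, 5))
```

To within rounding, the result agrees with the X2 and X3 rows of Table A2 (.49329, .86987 and .82605, .56360).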

Levels  of  Communality 

(2)  $R = FF' + U^{2}$

where F = a matrix of factor loadings, and
$U^{2}$ = a diagonal matrix containing the proportion of uncorrelated
error variance of each variable (uniqueness).

Furthermore,

(3)  $H^{2} = I - U^{2}$

where $H^{2}$ = a diagonal matrix containing the proportion of variance
each variable shares with the others (i.e., communality).
It is desirable to vary the degree of communality (or conversely, the
degree of uniqueness) over four levels: high communality,
$H^{2} \sim U(0.70, 0.90)$; medium communality, $H^{2} \sim U(0.40, 0.60)$;
low communality, $H^{2} \sim U(0.10, 0.30)$; and wide communality,
$H^{2} \sim U(0.10, 0.90)$. In order to ensure comparability across
communality levels, the four levels of communality were selected to be
linear combinations of one another:

(4)  $h^{2}_{\text{medium},j} = h^{2}_{\text{high},j} - 0.3$

(5)  $h^{2}_{\text{low},j} = h^{2}_{\text{high},j} - 0.6$

(6)  $h^{2}_{\text{wide},j} = 4\,h^{2}_{\text{low},j} - 0.3$

where j = 1, 2, ..., 13 indexes the observed attributes.

The  diagonal  entries  of  h2  high  for  population  1  were  0.84905, 
0.78175,  0.88172,  0.89331,  0.80636,  0.84063,  0.70259,  0.80925,  0.77029, 
0.78500,  0.82318,  0.75626,  and  0.70785. 
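
The relation between the four communality levels can be checked with the population 1 values just listed. In this sketch (Python with NumPy) the shift of 0.6 for the low level is the value implied by the stated range of 0.10 to 0.30 for low communality; the variable names are assumptions.

```python
import numpy as np

# Diagonal entries of h2 high for population 1, as listed in the text.
h2_high = np.array([0.84905, 0.78175, 0.88172, 0.89331, 0.80636, 0.84063, 0.70259,
                    0.80925, 0.77029, 0.78500, 0.82318, 0.75626, 0.70785])

h2_medium = h2_high - 0.3       # maps U(0.70, 0.90) onto U(0.40, 0.60)
h2_low = h2_high - 0.6          # maps U(0.70, 0.90) onto U(0.10, 0.30)
h2_wide = 4.0 * h2_low - 0.3    # maps U(0.10, 0.30) onto U(0.10, 0.90)

for name, h2 in [("medium", h2_medium), ("low", h2_low), ("wide", h2_wide)]:
    print(name, round(h2.min(), 5), round(h2.max(), 5))
```

Because every level is a linear function of the same thirteen draws, the rank order of the variables' communalities is identical across levels.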

The final population intercorrelation matrix is given by

(7)  $P = FF' + U^{2}$

where F = HA,
$U^{2}$ is given by Equation 3, and
H = a diagonal matrix containing elements given by Equation 4,
5, or 6 above.

Therefore, for each population there were two levels of factors and
four levels of communalities. Thus there were eight population matrices
per population replication.
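
Equation 7 can be sketched as follows. The sketch assumes that H carries the square roots of the communalities, so that $H^{2}$ in Equation 3 has the communalities on its diagonal, and that the first three high-communality values pair with variables X1 to X3 of Table A2. Both the pairing and the function name are assumptions made for illustration.

```python
import numpy as np

def population_matrix(A, h2):
    """P = F F' + U^2 with F = H A, H = diag(sqrt(h2)), and U^2 = I - H^2."""
    H = np.diag(np.sqrt(h2))
    F = H @ A
    U2 = np.eye(len(h2)) - H @ H
    return F @ F.T + U2

# Rows X1-X3 of Table A2 and the first three h2 high values for population 1.
A = np.array([[0.99946, 0.03277],
              [0.49329, 0.86987],
              [0.82605, 0.56360]])
h2 = np.array([0.84905, 0.78175, 0.88172])
P = population_matrix(A, h2)
print(np.round(P, 4))
```

Since the rows of A have unit length, the diagonal of P is 1 up to rounding, so P is a correlation matrix.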

The  regression  weights  and  population  multiple  correlation  for  all 
population  1  matrices  are  presented  in  Table  A4. 
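
The text does not spell out how the population weights were computed. The sketch below uses the standard formulas for standardized variables, $b = P_{xx}^{-1} p_{xy}$ and $R^{2} = p_{xy}' b$, with the criterion as the last variable. The 3 × 3 matrix is a hypothetical example, not one of the study's populations.

```python
import numpy as np

def regression_from_correlations(P):
    """Standardized regression weights and R^2 when the criterion is the last variable in P."""
    P_xx = P[:-1, :-1]               # predictor intercorrelations
    p_xy = P[:-1, -1]                # predictor-criterion correlations
    b = np.linalg.solve(P_xx, p_xy)  # b = P_xx^{-1} p_xy
    r2 = float(p_xy @ b)             # squared multiple correlation
    return b, r2

# Hypothetical correlation matrix: two predictors and a criterion.
P = np.array([[1.0, 0.5, 0.6],
              [0.5, 1.0, 0.4],
              [0.6, 0.4, 1.0]])
b, r2 = regression_from_correlations(P)
print(np.round(b, 4), round(r2, 4), round(np.sqrt(r2), 4))
```

The multiple correlation R is the square root of $R^{2}$.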

Table A4. Regression Weights and R^2 for Population 1

                      Two factors, communality                       Four factors, communality
Regression      High       Medium        Low        Wide        High       Medium        Low        Wide
weights

b1            -.0283      -.0111      -.0005      -.0277      -.0570      -.0442      -.0190      -.0158
b2             .1071       .1068       .0593       .0284       .1346       .1230       .0608       .0381
b3             .1129       .0850       .0526       .0549      -.0732      -.0275       .0018      -.0681
b4             .2457       .1575       .0904       .1959       .3206       .1945       .0999       .2408
b5            -.1121      -.0930      -.0467      -.0450       .1251       .0966       .0408       .0403
b6            -.0035       .0077       .0097      -.0141      -.0533      -.0265      -.0046      -.0336
b7             .0528       .0627       .0312       .0058       .0337       .0422       .0216       .0033
b8             .1609       .1416       .0764       .0578       .2080       .1766       .0846       .0641
b9            -.0537      -.0443      -.0193      -.0197       .0184       .0062      -.0003       .0112
b10            .0881       .0896       .0510       .0222       .0554       .0582       .0338       .0134
b11            .1424       .1262       .0718       .0503       .1953       .1516       .0732       .0775
b12           -.0755      -.0688      -.0321      -.0225      -.1626      -.1444      -.0598      -.0397

R^2            .6806       .3474       .0647       .1195       .6626       .3156       .0513       .1123
R              .8250       .5894       .2544       .3457       .8140       .5618       .2265       .3350
Det.       .3687x10^-?     .006113     .3626       .002494  .1322x10^-?    .02633      .5247       .01697

Population  Replication 

The matrices H, C, and X, which generated the characteristics of P, are
stochastic in nature. To determine the importance of such random
perturbation on the model (i.e., to determine the effect of the
experimenter's choice of variables), the above procedure was replicated
two times, allowing A to remain the same but generating new random number
matrices H, C, and X. These computations were done on an IBM 370/155
at the University of Illinois Medical Center in Chicago using
double-precision FORTRAN words. The random number generator used was
that of Lewis and Payne (1973). Test results from this generator are
reported by Richardson.

The  regression  weights,  bi,  and  population  multiple  correlation  for 
all  population  2  matrices  are  given  in  Table  A5. 
Table A5. Regression Weights and R^2 for Population 2

                      Two factors, communality                       Four factors, communality
Regression      High       Medium        Low        Wide        High       Medium        Low        Wide
weights

b1             .1212       .1082       .0749       .0679      -.0106       .0131       .0035      -.0913
b2             .1537       .1345       .0957       .1063       .3159       .2277       .1345       .2866
b3             .0803       .0948       .0604       .0238       .1937       .1871       .0910       .0610
b4             .0575       .0603       .0431       .0321      -.0272      -.0040       .0125      -.0193
b5             .1241       .1213       .0849       .0655      -.0791      -.0305      -.0014      -.0688
b6             .2208       .1538       .1089       .2754      -.3458       .2106       .1133       .3370
b7             .0545       .0597       .0414       .0259       .0574       .0653       .0475       .0303
b8             .0003      -.0023       .0006       .0059       .0260       .0238       .0058      -.0060
b9             .0242       .0317       .0176       .0017      -.0938      -.0804      -.0411      -.0395
b10            .0570       .0687       .0415       .0132       .0526       .0600       .0388       .0202
b11            .1601       .1276       .0926       .1484       .0537       .0653       .0614       .0695
b12            .0054       .0090       .0036       .0041       .1394       .1224       .0586       .0595

R^2            .7521       .4180       .1142       .3724       .7203       .3623       .0802       .3277
R              .8672       .6465       .3379       .6102       .8487       .6019       .2832       .5724
Det.       .7812x10^-?     .008115     .4070       .006693  .3055x10^-?    .03676      .5864       .04599

Sample  Size  (N) 

Most standard errors depend on the number of observations (N). It
was felt that three levels should adequately span this variable and allow
for possible quadratic effects. N was set at 16, 64, and 256 observations.
The lower level was selected because it yields very few degrees of
freedom while allowing the matrix to be nonsingular. Pilot work
suggested there would be a lack of discrimination among techniques in
terms of MSE if the upper limit were raised. As the variance is usually
related to $N^{2}$, the intermediate level was selected on the basis of an
intermediate number in a quadratic progression ($16 = 4^{2}$,
$64 = 8^{2}$, $256 = 16^{2}$). To keep costs within reason, only one
intermediate level was chosen.

Sampling  Replications 

The next step in the simulation was to generate sample intercorrelation
matrices from each of the sixteen population matrices. Within
each sample size and for each population, one hundred sample
replications were done. A replication (given the underlying population)
was done by use of Wishart (1928) matrices. Each population matrix was
Choleski-decomposed into

(8)  $P = TT'$

where T = an upper triangular matrix.
Samples from P (Wijsman, 1959) were generated by forming a
sum of squares matrix:

(9)  $S = TWT'$

where W = a Wishart matrix. W itself can be decomposed into

(10)  $W = GG'$

where $G = [g_{ij}]$ with
    $g_{ij} \sim N(0, 1)$ if $i > j$,
    $g_{ij} \sim \chi_{N-i-1}$ if $i = j$,
    $g_{ij} = 0$ if $i < j$.

Therefore,

(11)  $S = (TG)(TG)'$

For $i > j$, $g_{ij}$ was generated by the polar variant of Box and
Muller's (1958) technique (program GGNMP of IMSL). For $i = j$, $g_{ij}$
was generated by a variant of the Lewis-Payne generator (Payne and
Lewis, 1971; Sobolewski and Payne, 1972), given a random uniform
deviate (RANDU of SSP).
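
The sampling scheme of Equations 8 through 11 can be sketched with the textbook Bartlett decomposition. This is an illustration in Python with NumPy, not the FORTRAN program used in the study. It assumes N - 1 degrees of freedom for the Wishart matrix and zero-based indexing of the diagonal, so the chi subscript may look different from the one printed above. NumPy returns a lower triangular Choleski factor; any T with TT' = P yields the same distribution for S. The division by N - 1 anticipates Equation 12.

```python
import numpy as np

def sample_covariance(P, N, rng):
    """Draw one sample covariance matrix S/(N-1) whose expectation is P."""
    p = P.shape[0]
    n = N - 1                      # degrees of freedom of the Wishart matrix
    T = np.linalg.cholesky(P)      # lower triangular factor, P = T T'
    # G is lower triangular: N(0,1) below the diagonal, zeros above it.
    G = np.tril(rng.standard_normal((p, p)), k=-1)
    # Diagonal: square roots of chi-square variates with n, n-1, ..., n-p+1 df.
    G[np.diag_indices(p)] = np.sqrt(rng.chisquare(n - np.arange(p)))
    TG = T @ G
    S = TG @ TG.T                  # S = (TG)(TG)'
    return S / n

# Hypothetical 3 x 3 population correlation matrix and the smallest sample size.
P = np.array([[1.0, 0.5, 0.3],
              [0.5, 1.0, 0.2],
              [0.3, 0.2, 1.0]])
rng = np.random.default_rng(0)     # arbitrary seed, for illustration only
C = sample_covariance(P, 16, rng)
print(np.round(C, 4))
```

Averaging many such draws should recover P, which gives a simple check on the generator.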

A distribution of 10,000 samples of the above normal and chi-square
variates was generated and was found to be distributed according to
theoretical expectations. The sample unbiased covariances were obtained
by

(12)  $C = S/(N - 1)$

Thus, $E(C) = P$. Each of the population variances was set
arbitrarily to 1.0. A sample of means was generated by

(13)  $x = N^{-0.5}\,Td$

where d = a vector of normal deviates generated by GGNMP, as above.

Therefore x was the vector of sample means sampled from the
population matrix (P) with sample size N (N = 16, 64, and 256) and
population mean zero.
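
Equation 13 can be sketched in the same way. Since d has identity covariance, x has covariance TT'/N = P/N, as expected for a vector of sample means. The function name, the seed, and the 3 × 3 matrix are assumptions for illustration.

```python
import numpy as np

def sample_means(P, N, rng):
    """Draw one vector of sample means x = N^(-0.5) T d with covariance P/N."""
    T = np.linalg.cholesky(P)            # P = T T'
    d = rng.standard_normal(P.shape[0])  # independent N(0,1) deviates
    return T @ d / np.sqrt(N)

# Hypothetical 3 x 3 population correlation matrix.
P = np.array([[1.0, 0.5, 0.3],
              [0.5, 1.0, 0.2],
              [0.3, 0.2, 1.0]])
rng = np.random.default_rng(0)           # arbitrary seed, for illustration only
x = sample_means(P, 16, rng)
print(np.round(x, 4))
```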

For each level of N there were 100 sample replications. Therefore
there were 4,800 mean and covariance matrices of 12 predictors and
criterion (two population replications by two levels of number of
factors by four levels of communality by three levels of number of
observations (N) by 100 sample replications).
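
The size of the design can be verified by enumerating its cells. The labels below are taken from the text; the variable names are assumptions.

```python
from itertools import product

population_reps = (1, 2)
n_factors = (2, 4)
communality = ("high", "medium", "low", "wide")
sample_sizes = (16, 64, 256)
n_sample_reps = 100

# One population matrix per (replication, number of factors, communality level).
populations = list(product(population_reps, n_factors, communality))
# One design cell per population matrix and sample size.
cells = list(product(populations, sample_sizes))
n_samples = len(cells) * n_sample_reps

print(len(populations), len(cells), n_samples)
```

This gives sixteen population matrices, 48 design cells, and 4,800 sampled mean and covariance matrices.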